Scaling Up Mixed Workloads: A Battle of Data Freshness, Flexibility, and Scheduling
Abstract
The common “one size does not fit all” paradigm isolates transactional and analytical workloads into separate, specialized database systems, with operational data periodically replicated to a data warehouse for analytics. Today, however, the competitiveness of enterprises depends on real-time reporting on operational data, which necessitates integrating transactional and analytical processing in a single database system. The mixed workload should be able to query and modify common data in a shared schema. The database needs to provide performance guarantees for transactional workloads and, at the same time, efficiently evaluate complex analytical queries. In this paper, we share our analysis of the performance of two main-memory databases that support mixed workloads, SAP HANA and HyPer, while evaluating the mixed workload CHbenCHmark. By examining their similarities and differences, we identify the factors that affect performance while scaling the number of concurrent transactional and analytical clients. The three main factors are (a) data freshness, i.e., how recent the data processed by analytical queries is, (b) flexibility, i.e., restricting transactional features in order to increase optimization choices and enhance performance, and (c) scheduling, i.e., how the mixed workload utilizes resources. Specifically for scheduling, we show that the absence of workload management under high concurrency leads to analytical workloads overwhelming the system and severely hurting the performance of transactional workloads.
Similar resources
Scaling Up a Strengthened Youth-Friendly Service Delivery Model to Include Long-Acting Reversible Contraceptives in Ethiopia: A Mixed Methods Retrospective Assessment
Background: Donor-funded projects are small-scale and time-limited, with gains that soon dissipate when donor funds end. This paper presents findings from an assessment that sought to understand the successes, challenges, and barriers that influence the scaling up and sustainability of a tested, strengthened youth-friendly service (YFS) delivery model providing an expanded contraceptive method choice in one locat...
A fuzzy mixed-integer goal programming model for a parallel machine scheduling problem with sequence-dependent setup times and release dates
This paper presents a new mixed-integer goal programming (MIGP) model for a parallel machine scheduling problem with sequence-dependent setup times and release dates. Two objectives are considered in the model to minimize the total weighted flow time and the total weighted tardiness simultaneously. Due to the complexity of the above model and uncertainty involved in real-world scheduling probl...
Minimizing the maximum tardiness and makespan criteria in a job shop scheduling problem with sequence dependent setup times
The job shop scheduling problem (JSP) is one of the most difficult problems in traditional scheduling because each job consists of a set of operations, and each operation is processed by a machine. When an operation is placed on a machine, it is essential to consider setup times, which depend strongly on the sequencing of jobs on the machines. This research is developed ...
Adaptive NUMA-aware data placement and task scheduling for analytical workloads in main-memory column-stores
Non-uniform memory access (NUMA) architectures pose numerous performance challenges for main-memory column-stores in scaling up analytics on modern multi-socket multi-core servers. A NUMA-aware execution engine needs a strategy for data placement and task scheduling that prefers fast local memory accesses over remote memory accesses, and avoids an imbalance of resource utilization, both CPU and ...
Freshness-Aware Scheduling of Continuous Queries in the Dynamic Web
The dynamics of the Web and the demand for new, active services are imposing new requirements on Web servers. One such new service is the processing of continuous queries whose output data stream can be used to support the personalization of individual users' web pages. In this paper, we propose a new scheduling policy for continuous queries with the objective of maximizing the freshness ...